Get Started

Make Reinforcement Learning Easier

Experience replay is widely used in off-policy reinforcement learning. With cpprb, you can start experimenting quickly without having to implement a replay buffer yourself.

Fast & Flexible


Computationally heavy operations are implemented in C++ and Cython, so cpprb is usually faster than a naive pure-Python implementation.


cpprb supports Ape-X on a single machine. You don't need to manage error-prone locks yourself: cpprb internally locks only the critical sections.

Flexible Environment

cpprb accepts a flexible environment specification. Any number of NumPy-compatible environment values can be stored.

Framework Free

You can build your own reinforcement learning algorithms together with your favorite deep learning library (e.g. TensorFlow, PyTorch).

Ecosystem & Community


Questions, requests, and other feedback are welcome.


TF2RL provides a set of reinforcement learning algorithms for TensorFlow 2. TF2RL uses cpprb for its off-policy algorithms.